Normalization of Gender, Dialect and Speaking style using Probabilistic front-ends

Authors

Udhyakumar Nallasamy

Florian Metze

Thomas Schaaf

Abstract

This paper analyzes the capability of probabilistic Multilayer Perceptron (MLP) front-ends to perform various normalizations for robust Automatic Speech Recognition (ASR). We find decision trees to be a useful tool for investigating the normalization of the feature space achieved by various front-ends. We add questions about different environmental conditions to the training of the phonetic context decision tree, and count the number of splits dedicated to lexical discrimination using context and the number dedicated to these environmental conditions. We compare (1) BottleNeck (BN) features and (2) standard stacked Mel Frequency Cepstral Coefficients (MFCC) with Linear Discriminant Analysis (LDA). In previous work, we found that the BN front-end leads to fewer gender questions than MFCC, which may be part of the reason why BN front-ends can achieve significant improvements. In this work, we extend this approach to the analysis of dialect on a large database of Pan-Arabic speech.
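The split-counting analysis described in the abstract can be illustrated with a minimal sketch. The tree layout, the question names (such as `context:left_is_vowel` or `gender:female`), and the category prefixes are hypothetical and chosen only for illustration; they are not taken from the paper or from any particular ASR toolkit.

```python
from collections import Counter

# Hypothetical toy decision tree: each internal node asks one question,
# leaves are None. All question names are illustrative assumptions.
TOY_TREE = {
    "question": "context:left_is_vowel",
    "yes": {
        "question": "gender:female",
        "yes": None,
        "no": {"question": "context:right_is_nasal", "yes": None, "no": None},
    },
    "no": {
        "question": "dialect:levantine",
        "yes": None,
        "no": None,
    },
}


def count_splits(node, counts=None):
    """Count internal nodes (splits) per question category (prefix before ':')."""
    if counts is None:
        counts = Counter()
    if node is None:  # a leaf contributes no split
        return counts
    category = node["question"].split(":", 1)[0]
    counts[category] += 1
    count_splits(node["yes"], counts)
    count_splits(node["no"], counts)
    return counts


def split_fractions(counts):
    """Return the share of all splits that each category accounts for."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.items()}


counts = count_splits(TOY_TREE)
print(dict(counts))
print(split_fractions(counts))
```

Running the same count on trees trained with two different front-ends would give the kind of comparison the abstract describes: a smaller share of gender or dialect splits suggests that the features already normalize for that factor.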

Similar resources

Percentage of Consonants Correct for 3-5 Years Old Kurdish-Speaking Children With Middle Kurmanji-Mukryani Dialect

Objectives: The present research aims to study the normal development of Percentage of Consonants Correct (PCC), as an Articulation Competency Index (ACI), in Kurdish-speaking children with the Middle Kurmanji-Mukryani dialect. PCC was examined in terms of the manner of articulation and position of sound in the word. Methods: In this descriptive-analytical cross-sectional study, 120 Kurdish-speak...

Feature Level Compensation for Robust Speaker Identification in Mismatched Conditions

In this paper, robust front-end features are proposed to improve speaker identification (SI) performance under real-world conditions, such as mismatch between training and testing conditions. The most commonly used MFCC features are highly sensitive to effects such as channel and environment mismatch. Characteristics of speech change with room acoustics, ch...

Dialect Variation in Speaking Rate

Differences in speaking rate among American English regional dialects have long been assumed and have become popular belief in U.S. culture, without evidence to support or refute them. This study compares the speaking rates of speakers in south-central Wisconsin and western North Carolina in order to see if southerners do, in fact, speak more slowly than northerners. Age and gender are also comp...

Perceptual compensation for differences in speaking style

It is well-established that listeners will shift their categorization of a target vowel as a function of acoustic characteristics of a preceding carrier phrase (CP). These results have been interpreted as an example of perceptual normalization for variability resulting from differences in talker anatomy. The present study examined whether listeners would normalize for acoustic variability resul...

Transcribing radio news

We have recently extended the capabilities of BBN's large vocabulary discrete-utterance speech recognition system (BYBLOS) to operate on raw audio recordings of radio news programming. The recordings are given to the system as large monolithic waveforms without any additional side information. Our goal is to transcribe all speech in the input with the highest accuracy possible. The problem is ve...

Publication date: 2015
